Antipattern: Solve a Complex Problem in One Step

Let's look into the drawbacks of retrieving results in one go in detail.

We'll cover the following

Unintended products
As though that weren’t enough

SQL is a very expressive language — you can accomplish a lot in a single query or statement. But that doesn’t mean it’s mandatory or even a good idea to aim to solve every problem in one line of code. Do we have this habit with any other programming language we use? Probably not.

Unintended products#

One common consequence of producing all our results in one query is something called a Cartesian product. This happens when two of the tables in the query have no condition restricting their relationship. Without such a restriction, the JOIN for the two tables pairs each row in the first table to every row in the other table. Each such pairing becomes a row of the result set, and we end up with many more rows than we expected.

Let’s see an example. Suppose we want to query our bugs database to count the number of bugs, fixed and open, for a given product. Many programmers would try to use a query like the following to calculate these counts:

Enter to Rename, Shift+Enter to Preview

Retrieving all the results with a single query

In reality, we happen to know that there are eleven fixed bugs and seven open bugs for the given product. So, the result of the query is puzzling:

product_id	count_fixed	count_open
1	77	77

What caused this to be so inaccurate? It’s no coincidence that 77 is 11 times 7. This example joins the Products table to two different subsets of Bugs, but this results in a Cartesian product between those two sets of bugs. Each of the twelve rows for FIXED bugs is paired with all seven rows for OPEN bugs.

We can visualize the Cartesian product graphically below. Each line connecting a fixed bug to an open bug becomes a row in the interim result set (before grouping is applied).

We can see this interim result set by eliminating the GROUP BY clause and aggregate functions.

Enter to Rename, Shift+Enter to Preview

Retrieving results using a single query

The only relationships expressed in this query are between the BugsProducts table and each subset of Bugs. No conditions restrict every FIXED bug from matching with every OPEN bug, and the default result is that they do. The result produces twelve times seven rows.

It’s all too easy to produce an unintentional Cartesian product when we try to make a query do double-duty like this. If we try to do more unrelated tasks with a single query, the total could be multiplied by yet another Cartesian product.

As though that weren’t enough#

Besides the fact that we can get the wrong results, it’s important to consider that these queries are simply hard to write, hard to modify, and hard to debug. We should expect to get regular requests for incremental enhancements to our database applications. Managers want more complex reports and more fields in a user interface. If we design intricate, monolithic SQL queries, it’s more costly and time-consuming to make enhancements to them. After all, our time is worth something, both to us and to our projects.

There are runtime costs, too. An elaborate SQL query that uses many JOIN operations, correlated subqueries, and other operations is harder for the SQL engine to optimize and execute quickly than a more straightforward query. Programmers instinctively feel that the fewer the SQL queries, the better the performance. This is true, assuming that the SQL queries in question are of equal complexity. On the other hand, the cost of a single monster query can become extremely large and it can be much more economical to use several simpler queries, instead.

Synopsis: Spaghetti Query

Solution: Divide and Conquer

Mark as Completed

Report an Issue

Introduction

Logical Antipattern - Jaywalking

Logical Antipattern - Naive Trees

Logical Antipattern - ID Required

Logical Antipattern - Keyless Entry

Assessment: Logical Antipatterns - Part 1

Logical Antipattern - Entity-Attribute-Value

Logical Antipattern - Polymorphic Associations

Logical Antipattern - Multicolumn Attributes

Logical Antipattern - Metadata Tribbles

Assessment: Logical Antipatterns - Part 2

Physical Antipattern - Rounding Errors

Physical Antipattern - 31 Flavors

Physical Antipattern - Phantom Files

Physical Antipattern - Index Shotgun

Assessment: Physical Antipatterns

Query Antipattern - Fear of the Unknown

Query Antipattern - Ambiguous Groups

Query Antipattern - Random Selection

Query Antipattern - Poor Man’s Search Engine

Query Antipattern - Spaghetti Query

Query Antipattern - Implicit Columns

Assessment: Query Antipattern

Application Development Antipattern - Readable Passwords

Application Development Antipattern - SQL Injection

Application Development Antipattern - Pseudokey Neat-Freak

Application Development Antipattern - See No Evil

Application Development Antipattern - Diplomatic Immunity

Application Development Antipattern - Magic Beans

Assessment: Application Development Antipattern

Conclusion

Antipattern: Solve a Complex Problem in One Step

Unintended products#

As though that weren’t enough#